Model Averaging With Holdout Estimation of the Posterior Distribution

Authors

  • Alexandre Lacoste
  • François Laviolette
  • Mario Marchand
Abstract

The holdout estimate of a model's expected loss is biased and noisy, yet practitioners often rely on it to select the model used for further predictions. Repeating the learning phase with small variations of the training set reveals variation in the selected model, which in turn induces substantial variation in final test performance. We therefore propose a small modification to k-fold cross-validation that greatly improves the generalization performance of the final predictor. Instead of using the empirical average of the validation losses to select a single model, we use the bootstrap to resample the validation losses (without retraining). The variation in the selected models induces a posterior distribution that is then used for model averaging. A comparison of this approach with classical cross-validation on 38 datasets, using a significance test, shows that it has higher generalization performance with probability over 0.9.
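The resampling step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `bootstrap_model_posterior` and `average_predictions`, the loss-matrix layout, the number of bootstrap replicates, and the use of a weighted average of real-valued predictions are all hypothetical choices made for this example. Each row holds the validation losses of one already-trained model on the held-out examples, so no retraining is needed.

```python
import random
from collections import Counter


def bootstrap_model_posterior(losses, n_boot=1000, seed=0):
    """Estimate how often each model would be selected under resampling.

    losses[m][i] is the validation loss of model m on held-out example i
    (assumed layout). Returns one weight per model; weights sum to 1.
    """
    rng = random.Random(seed)
    n_models = len(losses)
    n_examples = len(losses[0])
    wins = Counter()
    for _ in range(n_boot):
        # Resample example indices with replacement; the same indices are
        # used for every model so the comparison stays paired.
        idx = [rng.randrange(n_examples) for _ in range(n_examples)]
        totals = [sum(row[i] for i in idx) for row in losses]
        # Select the model with the lowest resampled loss
        # (ties go to the lowest model index).
        best = min(range(n_models), key=totals.__getitem__)
        wins[best] += 1
    return [wins[m] / n_boot for m in range(n_models)]


def average_predictions(weights, predictions):
    """Weighted average of per-model predictions (assumed combination rule).

    predictions[m][j] is the real-valued prediction of model m on test point j.
    """
    n_points = len(predictions[0])
    return [
        sum(w * preds[j] for w, preds in zip(weights, predictions))
        for j in range(n_points)
    ]
```

A model that wins every bootstrap replicate receives all the weight, which recovers ordinary model selection; when several models are close, the weight is spread among them and the final prediction becomes an average.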


Similar resources

IMAGE SEGMENTATION USING GAUSSIAN MIXTURE MODEL

Stochastic models such as mixture models, graphical models, Markov random fields and hidden Markov models play a key role in probabilistic data analysis. In this paper, we fit a Gaussian mixture model to the pixels of an image. The parameters of the model are estimated by the EM algorithm. In addition, the label corresponding to each pixel of the true image is assigned by Bayes' rule. In fact, ...


Bayesian Sample Size Computing for Estimation of Binomial Proportions using p-tolerance with the Lowest Posterior Loss

This paper is devoted to computing the sample size for a binomial distribution using a Bayesian approach. The quadratic loss function is considered, and three criteria are applied to obtain p-tolerance regions with the lowest posterior loss. These criteria are: average length, average coverage and worst outcome.


Image Segmentation using Gaussian Mixture Model

Abstract: Stochastic models such as mixture models, graphical models, Markov random fields and hidden Markov models play a key role in probabilistic data analysis. In this paper, we applied a Gaussian mixture model to the pixels of an image. The parameters of the model were estimated by the EM algorithm. In addition, the label corresponding to each pixel of the true image was assigned by Bayes' rule. In fact,...


Improving the Performance of Bayesian Estimation Methods in Estimations of Shift Point and Comparison with MLE Approach

A Bayesian analysis is used to detect a change point in a sequence of independent random variables from exponential distributions. In this paper, we estimate the change point that occurs in a sequence of independent exponential observations. The Bayes estimators are derived for the change point, the rate of the exponential distribution before the shift and the rate of the exponential distribution after s...


1 Model Search, Selection, and Averaging.

Although some model selection procedures boil down to testing hypotheses about parameters and choosing the best parameter or a subset of parameters, model selection is a broader inferential task. It can be nonparametric, for example. Model selection sometimes can be interpreted as an estimation problem. If the competing models are indexed by i ∈ {1, 2, . . . , m}, getting the posterior distribu...




Publication date: 2012